{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "9d11321e",
   "metadata": {},
   "source": [
    "# Horizontally Federated XGBoost"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "079c8a84",
   "metadata": {},
   "source": [
    ">The following code is for demonstration only. It is **NOT for production** because of system security concerns; please **DO NOT** use it directly in production."
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "89135d8c",
   "metadata": {},
   "source": [
    "In this tutorial, we will learn how to use SecretFlow to train tree models in a horizontally federated setting. SecretFlow provides tree modeling for horizontal scenarios through SFXgboost. Its usage is similar to XGBoost, so you can easily convert an existing XGBoost program into a federated model in SecretFlow."
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "a7190826",
   "metadata": {},
   "source": [
    "## XGBoost"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "19cb8b99",
   "metadata": {},
   "source": [
    "XGBoost is an optimized distributed gradient boosting library designed to be highly efficient, flexible and portable. It implements machine learning algorithms under the Gradient Boosting framework.\n",
    "\n",
    "For more details, see the official [XGBoost tutorials](https://xgboost.readthedocs.io/en/latest/tutorials/index.html).\n"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "dafa8ea9",
   "metadata": {},
   "source": [
    "### Prepare SecretFlow devices"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "id": "c40b050a",
   "metadata": {},
   "outputs": [],
   "source": [
    "%load_ext autoreload\n",
    "%autoreload 2\n",
    "\n",
    "import secretflow as sf\n",
    "\n",
    "# Check the version of your SecretFlow\n",
    "print('The version of SecretFlow: {}'.format(sf.__version__))\n",
    "\n",
    "# In case you have a running secretflow runtime already.\n",
    "sf.shutdown()\n",
    "\n",
    "sf.init(['alice', 'bob', 'charlie'], address='local')\n",
    "alice, bob, charlie = sf.PYU('alice'), sf.PYU('bob'), sf.PYU('charlie')"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "b5295255",
   "metadata": {},
   "source": [
    "### XGBoost Example"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "7ca2f164",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "erythema                                      int64\n",
      "scaling                                       int64\n",
      "definite_borders                              int64\n",
      "itching                                       int64\n",
      "koebner_phenomenon                            int64\n",
      "polygonal_papules                             int64\n",
      "follicular_papules                            int64\n",
      "oral_mucosal_involvement                      int64\n",
      "knee_and_elbow_involvement                    int64\n",
      "scalp_involvement                             int64\n",
      "family_history                                int64\n",
      "melanin_incontinence                          int64\n",
      "eosinophils_in_the_infiltrate                 int64\n",
      "pnl_infiltrate                                int64\n",
      "fibrosis_of_the_papillary_dermis              int64\n",
      "exocytosis                                    int64\n",
      "acanthosis                                    int64\n",
      "hyperkeratosis                                int64\n",
      "parakeratosis                                 int64\n",
      "clubbing_of_the_rete_ridges                   int64\n",
      "elongation_of_the_rete_ridges                 int64\n",
      "thinning_of_the_suprapapillary_epidermis      int64\n",
      "spongiform_pustule                            int64\n",
      "munro_microabcess                             int64\n",
      "focal_hypergranulosis                         int64\n",
      "disappearance_of_the_granular_layer           int64\n",
      "vacuolisation_and_damage_of_basal_layer       int64\n",
      "spongiosis                                    int64\n",
      "saw-tooth_appearance_of_retes                 int64\n",
      "follicular_horn_plug                          int64\n",
      "perifollicular_parakeratosis                  int64\n",
      "inflammatory_monoluclear_inflitrate           int64\n",
      "band-like_infiltrate                          int64\n",
      "age                                         float64\n",
      "class                                         int64\n",
      "dtype: object\n",
      "[0]\ttrain-merror:0.01913\n",
      "[1]\ttrain-merror:0.01366\n",
      "[2]\ttrain-merror:0.01366\n",
      "[3]\ttrain-merror:0.00820\n"
     ]
    }
   ],
   "source": [
    "import xgboost as xgb\n",
    "import pandas as pd\n",
    "from secretflow.utils.simulation.datasets import dataset\n",
    "\n",
    "df = pd.read_csv(dataset('dermatology'))\n",
    "df = df.fillna(value=0)  # fillna returns a new DataFrame, so keep the result\n",
    "print(df.dtypes)\n",
    "y = df['class']\n",
    "y = y - 1\n",
    "x = df.drop(columns=\"class\")\n",
    "dtrain = xgb.DMatrix(x, y)\n",
    "dtest = dtrain\n",
    "params = {\n",
    "    'max_depth': 4,\n",
    "    'objective': 'multi:softmax',\n",
    "    'min_child_weight': 1,\n",
    "    'max_bin': 10,\n",
    "    'num_class': 6,\n",
    "    'eval_metric': 'merror',\n",
    "}\n",
    "num_round = 4\n",
    "watchlist = [(dtrain, 'train')]\n",
    "bst = xgb.train(params, dtrain, num_round, evals=watchlist, early_stopping_rounds=2)"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "8c23a85f",
   "metadata": {},
   "source": [
    "### How does federated XGBoost work in SecretFlow?\n",
    "1. A federated, iteration-based binning method computes global bucket information from the data of all parties. These buckets serve as the candidate split points in the subsequent tree construction.\n",
    "2. Each client feeds its data into its local XGBoost engine to compute the gradients (G) and Hessians (H).\n",
    "3. Train the federated boosting model:  \n",
    "   1) Samples are reassigned to the nodes to be split.  \n",
    "   2) Each client computes the sums of grad and hess per bucket, using the previously computed binning buckets.  \n",
    "   3) Each client sends its sums of grad and hess to the server. The server uses secure aggregation to produce a global summary, chooses the best split point, and sends the best split info back to the clients.  \n",
    "   4) The clients update their local models.  \n",
    "4. Training finishes and the model is saved."
   ]
  },
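  {
   "cell_type": "markdown",
   "id": "e7a1c4b2",
   "metadata": {},
   "source": [
    "The steps above can be sketched in plain Python for a single feature at a single tree node. This is a simplified illustration under stated assumptions, not SecretFlow's implementation: the function names `local_histogram` and `best_split` are hypothetical, a plain sum stands in for secure aggregation, the bucket boundaries are hard-coded instead of coming from federated binning, and the gradients are those of a squared-error loss. With the toy data below, the aggregated histogram favors the bucket boundary at 2.0."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "f3d9a610",
   "metadata": {},
   "outputs": [],
   "source": [
    "from bisect import bisect_right\n",
    "\n",
    "# Illustrative sketch only: plain Python, NOT SecretFlow's implementation.\n",
    "\n",
    "\n",
    "def local_histogram(xs, grads, hessians, bin_edges):\n",
    "    # Client side: sum grad and hess per bucket for one feature.\n",
    "    g_hist = [0.0] * (len(bin_edges) + 1)\n",
    "    h_hist = [0.0] * (len(bin_edges) + 1)\n",
    "    for x, g, h in zip(xs, grads, hessians):\n",
    "        b = bisect_right(bin_edges, x)  # index of the bucket that holds x\n",
    "        g_hist[b] += g\n",
    "        h_hist[b] += h\n",
    "    return g_hist, h_hist\n",
    "\n",
    "\n",
    "def best_split(g_hist, h_hist, reg_lambda=0.1):\n",
    "    # Server side: scan every bucket boundary of the aggregated histogram.\n",
    "    def score(g, h):\n",
    "        return g * g / (h + reg_lambda)\n",
    "\n",
    "    g_total, h_total = sum(g_hist), sum(h_hist)\n",
    "    best_idx, best_gain = None, 0.0\n",
    "    g_left = h_left = 0.0\n",
    "    for i in range(len(g_hist) - 1):\n",
    "        g_left += g_hist[i]\n",
    "        h_left += h_hist[i]\n",
    "        gain = 0.5 * (\n",
    "            score(g_left, h_left)\n",
    "            + score(g_total - g_left, h_total - h_left)\n",
    "            - score(g_total, h_total)\n",
    "        )\n",
    "        if gain > best_gain:\n",
    "            best_idx, best_gain = i, gain\n",
    "    return best_idx, best_gain\n",
    "\n",
    "\n",
    "# Assumed global bucket boundaries (the result of step 1).\n",
    "bin_edges = [1.0, 2.0, 3.0]\n",
    "# Toy local data per client: (feature values, labels).\n",
    "toy_clients = {\n",
    "    'alice': ([0.5, 1.5, 2.5, 3.5], [0.0, 0.0, 1.0, 1.0]),\n",
    "    'bob': ([0.2, 1.8, 2.2, 3.9], [0.0, 0.0, 1.0, 1.0]),\n",
    "}\n",
    "base_pred = 0.5\n",
    "\n",
    "local_hists = []\n",
    "for xs, ys in toy_clients.values():\n",
    "    grads = [base_pred - y for y in ys]  # squared-error gradient (step 2)\n",
    "    hessians = [1.0] * len(ys)  # squared-error hessian (step 2)\n",
    "    local_hists.append(local_histogram(xs, grads, hessians, bin_edges))\n",
    "\n",
    "# A plain element-wise sum stands in for secure aggregation (step 3.3).\n",
    "g_sum = [sum(col) for col in zip(*(h[0] for h in local_hists))]\n",
    "h_sum = [sum(col) for col in zip(*(h[1] for h in local_hists))]\n",
    "\n",
    "split_idx, split_gain = best_split(g_sum, h_sum)\n",
    "print('best split threshold:', bin_edges[split_idx], 'gain:', split_gain)"
   ]
  },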
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "3be5325d",
   "metadata": {},
   "source": [
    "Create three entities in the SecretFlow environment: `Alice`, `Bob` and `Charlie`. `Alice` and `Bob` are the clients, and `Charlie` is the server. With these in place, you can start federated boosting."
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "67d6f007",
   "metadata": {},
   "source": [
    "### Prepare Data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "id": "54ac3a76",
   "metadata": {},
   "outputs": [],
   "source": [
    "from secretflow.data.horizontal import read_csv\n",
    "from secretflow.security.aggregation import SecureAggregator\n",
    "from secretflow.security.compare import SPUComparator\n",
    "from secretflow.utils.simulation.datasets import load_dermatology\n",
    "\n",
    "aggr = SecureAggregator(charlie, [alice, bob])\n",
    "spu = sf.SPU(sf.utils.testing.cluster_def(['alice', 'bob']))\n",
    "comp = SPUComparator(spu)\n",
    "data = load_dermatology(parts=[alice, bob], aggregator=aggr, comparator=comp)\n",
    "data.fillna(value=0, inplace=True)"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "baffdd20",
   "metadata": {},
   "source": [
    "### Prepare Params"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "id": "2d51d646",
   "metadata": {},
   "outputs": [],
   "source": [
    "params = {\n",
    "    # XGBoost parameter tutorial\n",
    "    # https://xgboost.readthedocs.io/en/latest/parameter.html\n",
    "    'max_depth': 4,  # Maximum tree depth\n",
    "    'eta': 0.3,  # Learning rate\n",
    "    'objective': 'multi:softmax',  # Objective function; supported: \"binary:logistic\", \"reg:logistic\", \"multi:softmax\", \"multi:softprob\", \"reg:squarederror\"\n",
    "    'min_child_weight': 1,  # Minimum sum of instance weight (hessian) needed in a child\n",
    "    'lambda': 0.1,  # L2 regularization term on weights (xgb's lambda)\n",
    "    'alpha': 0,  # L1 regularization term on weights (xgb's alpha)\n",
    "    'max_bin': 10,  # Maximum number of bins for feature bucketing\n",
    "    'num_class': 6,  # Only required in multi-class classification\n",
    "    'gamma': 0,  # Minimum gain required for a split (similar to min_impurity_split)\n",
    "    'subsample': 1.0,  # Subsample rate by rows\n",
    "    'colsample_bytree': 1.0,  # Feature selection rate by tree\n",
    "    'colsample_bylevel': 1.0,  # Feature selection rate by level\n",
    "    'eval_metric': 'merror',  # Supported eval metrics:\n",
    "    # 1. rmse\n",
    "    # 2. rmsle\n",
    "    # 3. mape\n",
    "    # 4. logloss\n",
    "    # 5. error\n",
    "    # 6. error@t\n",
    "    # 7. merror\n",
    "    # 8. mlogloss\n",
    "    # 9. auc\n",
    "    # 10. aucpr\n",
    "    # Special params in SFXgboost (all required)\n",
    "    'hess_key': 'hess',  # Required. Name of the hess column; choose a name that is not already in the dataset\n",
    "    'grad_key': 'grad',  # Required. Name of the grad column; choose a name that is not already in the dataset\n",
    "    'label_key': 'class',  # Required. Name of the label column in the dataset\n",
    "}"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "57bf92f0",
   "metadata": {},
   "source": [
    "### Create SFXgboost"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "id": "4bde4412",
   "metadata": {},
   "outputs": [],
   "source": [
    "from secretflow.ml.boost.homo_boost import SFXgboost\n",
    "\n",
    "bst = SFXgboost(server=charlie, clients=[alice, bob])"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "5c6714ee",
   "metadata": {},
   "source": [
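    "Run SFXgboost training. Here the same federated dataset is passed as both the training set and the validation set."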
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "id": "64d5e208",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[2m\u001b[36m(HomoBooster pid=3817964)\u001b[0m [0]\ttrain-merror:0.01366\tvalid-merror:0.01366\n",
      "\u001b[2m\u001b[36m(HomoBooster pid=3817954)\u001b[0m [0]\ttrain-merror:0.01366\tvalid-merror:0.01366\n",
      "\u001b[2m\u001b[36m(HomoBooster pid=3817963)\u001b[0m [0]\ttrain-merror:0.01366\tvalid-merror:0.01366\n",
      "\u001b[2m\u001b[36m(HomoBooster pid=3817964)\u001b[0m [1]\ttrain-merror:0.00820\tvalid-merror:0.00820\n",
      "\u001b[2m\u001b[36m(HomoBooster pid=3817954)\u001b[0m [1]\ttrain-merror:0.00820\tvalid-merror:0.00820\n",
      "\u001b[2m\u001b[36m(HomoBooster pid=3817963)\u001b[0m [1]\ttrain-merror:0.00820\tvalid-merror:0.00820\n",
      "\u001b[2m\u001b[36m(HomoBooster pid=3817964)\u001b[0m [2]\ttrain-merror:0.00820\tvalid-merror:0.00820\n",
      "\u001b[2m\u001b[36m(HomoBooster pid=3817954)\u001b[0m [2]\ttrain-merror:0.00820\tvalid-merror:0.00820\n",
      "\u001b[2m\u001b[36m(HomoBooster pid=3817963)\u001b[0m [2]\ttrain-merror:0.00820\tvalid-merror:0.00820\n",
      "\u001b[2m\u001b[36m(HomoBooster pid=3817964)\u001b[0m [3]\ttrain-merror:0.01093\tvalid-merror:0.01093\n",
      "\u001b[2m\u001b[36m(HomoBooster pid=3817954)\u001b[0m [3]\ttrain-merror:0.01093\tvalid-merror:0.01093\n",
      "\u001b[2m\u001b[36m(HomoBooster pid=3817963)\u001b[0m [3]\ttrain-merror:0.01093\tvalid-merror:0.01093\n",
      "\u001b[2m\u001b[36m(HomoBooster pid=3817964)\u001b[0m [4]\ttrain-merror:0.00820\tvalid-merror:0.00820\n",
      "\u001b[2m\u001b[36m(HomoBooster pid=3817954)\u001b[0m [4]\ttrain-merror:0.00820\tvalid-merror:0.00820\n",
      "\u001b[2m\u001b[36m(HomoBooster pid=3817963)\u001b[0m [4]\ttrain-merror:0.00820\tvalid-merror:0.00820\n",
      "\u001b[2m\u001b[36m(HomoBooster pid=3817964)\u001b[0m [5]\ttrain-merror:0.00546\tvalid-merror:0.00546\n",
      "\u001b[2m\u001b[36m(HomoBooster pid=3817954)\u001b[0m [5]\ttrain-merror:0.00546\tvalid-merror:0.00546\n",
      "\u001b[2m\u001b[36m(HomoBooster pid=3817963)\u001b[0m [5]\ttrain-merror:0.00546\tvalid-merror:0.00546\n"
     ]
    }
   ],
   "source": [
    "bst.train(data, data, params=params, num_boost_round=6)"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "56f1ee3c",
   "metadata": {},
   "source": [
    "Our federated XGBoost training is now complete, and `bst` is the federated boosting object."
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "fcebea9b",
   "metadata": {},
   "source": [
    "## Conclusion\n",
    "* This tutorial introduced how to train tree models in a horizontally federated setting with SecretFlow.\n",
    "* SFXgboost encapsulates the logic of the federated tree model. Models trained with SFXgboost remain compatible with XGBoost, so existing infrastructure can be used directly for online prediction and similar tasks.\n",
    "* Next, you can try SFXgboost on your own data by following the steps in this tutorial.\n"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.16"
  },
  "vscode": {
   "interpreter": {
    "hash": "db45a4cb4cd37a8de684dfb7fcf899b68fccb8bd32d97c5ad13e5de1245c0986"
   }
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}